@hawkw hawkw commented Dec 8, 2025

Cases, as described in RFD 603, give diagnosis engines a mechanism to group together ereports (and eventually, other fault management data) as part of a sitrep. This branch adds database tables for representing cases, and for associating ereports with cases. Subsequent branches will add a mechanism for sitreps to request alerts, which will also be associated with cases.

This branch was factored out from #9346, where I've been prototyping a diagnosis engine implementation. Although aspects of how the DEs themselves will work are still in flux, I think that cases, along with other parts of the database machinery such as the impending alert-requests branch, are pretty much nailed down.

@hawkw hawkw requested review from jgallagher and smklein December 8, 2025 23:45
@hawkw hawkw added the fault-management Everything related to the fault-management initiative (RFD480 and others) label Dec 8, 2025
@hawkw hawkw self-assigned this Dec 8, 2025
hawkw commented Dec 9, 2025

The failure installing rustc components in the helios / build TUF repo job is almost certainly not my fault; I've restarted the job.


comment TEXT NOT NULL,

PRIMARY KEY (sitrep_id, id)

Collaborator

Interesting; I would have figured the id would be the PK, but I'm not opposed to it being (sitrep_id, id) either.

Member Author

The ID can't be the PK on its own, since we would like to be able to insert a new sitrep non-transactionally without first removing the previous one, which means the same case ID can exist under more than one sitrep at a time. We also need the sitrep ID on each record so that we can delete the records associated with a sitrep when GCing old/orphaned sitreps.
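
To make the reasoning above concrete, here is a minimal in-memory sketch of a table keyed by `(sitrep_id, id)`. It is not the code from this branch: `CaseTable`, `insert`, `gc_sitrep`, and `cases_in` are hypothetical names, and plain integers stand in for the real ID types. It only illustrates that a composite key lets the same case ID exist under two sitreps at once, and that GC can drop everything belonging to one sitrep by its ID.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins for the real ID types (assumption: plain integers).
type SitrepId = u32;
type CaseId = u32;

/// In-memory model of a table whose primary key is `(sitrep_id, id)`.
#[derive(Default)]
struct CaseTable {
    rows: BTreeMap<(SitrepId, CaseId), String>,
}

impl CaseTable {
    /// Insert one case row for a sitrep. Returns `false` if that exact
    /// `(sitrep_id, id)` pair already exists, as a primary-key conflict would.
    fn insert(&mut self, sitrep: SitrepId, case: CaseId, comment: &str) -> bool {
        if self.rows.contains_key(&(sitrep, case)) {
            return false;
        }
        self.rows.insert((sitrep, case), comment.to_string());
        true
    }

    /// Delete every row belonging to one sitrep, returning how many were removed.
    /// This is only possible because each row carries its sitrep ID.
    fn gc_sitrep(&mut self, sitrep: SitrepId) -> usize {
        let before = self.rows.len();
        self.rows.retain(|key, _| key.0 != sitrep);
        before - self.rows.len()
    }

    /// List the case IDs recorded for one sitrep, in ascending order.
    fn cases_in(&self, sitrep: SitrepId) -> Vec<CaseId> {
        self.rows
            .range((sitrep, CaseId::MIN)..=(sitrep, CaseId::MAX))
            .map(|(key, _)| key.1)
            .collect()
    }
}
```

With a key on the case ID alone, inserting case 7 for a second sitrep would conflict with the row that the first sitrep still owns. With the composite key, both rows coexist until `gc_sitrep` removes the older sitrep's copy.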

Member Author

Actually, hmm, perhaps there is a way to have these records not contain the current sitrep ID, and GC them separately? That might make this table a lot smaller. I'd have to think about it.

Member Author

Thinking about this some more, yeah, I think I'd rather leave this as-is. I was wondering if there was a way to just ON CONFLICT DO UPDATE the sitrep ID for these records when inserting a new sitrep, but that would make them potentially disappear from the current sitrep if a new sitrep is inserted but never made current. So I'm going to allow the records to be copied a bunch and deleted by sitrep GC normally.
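
The trade-off described above can be sketched with two toy tables. This is an illustration under assumptions, not the schema from this branch: `UpsertTable` models the rejected idea (one row per case, with the sitrep ID overwritten on conflict) and `CopyTable` models the approach that was kept (one row per sitrep and case pair). All type and method names here are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-ins for the real ID types (assumption: plain integers).
type SitrepId = u32;
type CaseId = u32;

/// Rejected idea: one row per case, keyed by case ID alone. Inserting a new
/// sitrep overwrites the owning sitrep ID, like `ON CONFLICT DO UPDATE`.
#[derive(Default)]
struct UpsertTable {
    rows: BTreeMap<CaseId, SitrepId>,
}

impl UpsertTable {
    fn insert_sitrep(&mut self, sitrep: SitrepId, cases: &[CaseId]) {
        for &case in cases {
            // Overwrites any existing row for this case.
            self.rows.insert(case, sitrep);
        }
    }

    fn cases_in(&self, sitrep: SitrepId) -> Vec<CaseId> {
        self.rows
            .iter()
            .filter(|row| *row.1 == sitrep)
            .map(|row| *row.0)
            .collect()
    }
}

/// Kept approach: rows keyed by `(sitrep_id, id)`, so each sitrep gets its
/// own copy of every case and older copies wait for GC.
#[derive(Default)]
struct CopyTable {
    rows: BTreeSet<(SitrepId, CaseId)>,
}

impl CopyTable {
    fn insert_sitrep(&mut self, sitrep: SitrepId, cases: &[CaseId]) {
        for &case in cases {
            self.rows.insert((sitrep, case));
        }
    }

    fn cases_in(&self, sitrep: SitrepId) -> Vec<CaseId> {
        self.rows
            .iter()
            .filter(|row| row.0 == sitrep)
            .map(|row| row.1)
            .collect()
    }
}
```

Suppose sitrep 1 is current and sitrep 2 is inserted with the same cases but never made current. In `UpsertTable`, `cases_in(1)` comes back empty because every row now points at sitrep 2. In `CopyTable`, sitrep 1 keeps its rows, at the cost of storing a second copy until the orphaned sitrep is garbage-collected.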

@hawkw hawkw requested a review from smklein December 16, 2025 19:33
.map_err(|e| public_error_from_diesel(e, ErrorHandler::Server))
}

fn fm_sitrep_read_ereports_query(

Collaborator

I know it took a ton of thought and effort (and db restructuring) to get here, but this ended up looking very simple - nicely done.

Member Author

Thanks! Your advice was very helpful here. :)

@hawkw hawkw enabled auto-merge (squash) December 16, 2025 21:49
hawkw commented Dec 16, 2025

CI failure for build-and-test (helios) sure looks like a flake, as the Linux test run passed and it's in a totally unrelated saga test. Gonna re-run before reporting it as flaky.

@hawkw hawkw merged commit d385dba into main Dec 17, 2025
16 checks passed
@hawkw hawkw deleted the eliza/fm-cases branch December 17, 2025 01:06